TRANSLATED POISSON APPROXIMATION USING
EXCHANGEABLE PAIR COUPLINGS

By Adrian Röllin
University of Zurich 

It is shown that the method of exchangeable pairs introduced by
Stein [Approximate Computation of Expectations (1986) IMS, Hayward, CA]
for normal approximation can effectively be used for translated
Poisson approximation. Introducing an additional smoothness
condition, one can obtain approximation results in total variation and
also in a local limit metric. The result is applied, in particular, to the
anti-voter model on finite graphs as analyzed by Rinott and Rotar
[Ann. Appl. Probab. 7 (1997) 1080-1105], obtaining the same rate of
convergence, but now for a stronger metric.

1. Introduction. Let $W$ be a random variable with

(1.1)    $\mathbb{E}W = \mu$ and $\operatorname{Var} W = \sigma^2 < \infty$.

Stein [17] introduced a method (commonly called the exchangeable
pairs approach) to approximate $W_c := (W - \mu)/\sigma$ by the standard normal
distribution; Rinott and Rotar [14] then generalized the result and
successfully applied it to weighted $U$-statistics and the anti-voter model.
Their results imply convergence to the standard normal distribution in the
Kolmogorov and even in some stronger metrics; however, in this context, they
do not provide approximations in the total variation metric or prove local
limit-like results.

We will consider such results in this paper in the special case in which
$W$ is integer valued, the most common situation being the one where $W$ is
a sum of random indicators. As the total variation distance between $W$ and
the normal distribution will always be 1, we will instead use a translated
Poisson distribution as approximation, having the same support as $W$ and
matching the first two moments of $W$ as well as possible. If not otherwise
stated, we will assume throughout that $\sigma^2 \to \infty$. This actually implies that
the approximating probability distributions are not converging to a limiting
distribution, but the accuracy of our approximations nonetheless increases
as $\sigma^2$ becomes large. The total variation metric is invariant under scaling,
so that working with $W_c$ would bring no benefit. Besides the total variation
metric, we will also consider a metric from which local limit approximations
can be obtained.

We note that in the framework of Stein's method there are other
approaches to replacing the normal distribution by discrete analogues. In [12]
a distribution with support on $\mathbb{Z}$ is constructed, with the advantage of having
no truncation and rounding effect but at the cost of a somewhat more
complicated Stein operator. There, approximation theorems are provided using
the so-called zero biasing approach introduced in [9]. In [16] a translated
binomial distribution is used, and in [4] a special translated signed compound
Poisson distribution, both in the context of the so-called local approach. In
the discrete setting, exchangeable pairs have also been successfully used in
[8] for Poisson approximation in total variation.

The rest of the paper is organized as follows. In the next section we recall
the setup of the exchangeable pairs approach in the context of normal
approximation. We then introduce a simple smoothing condition under which it
is possible to obtain the stronger total variation bounds for translated
Poisson approximation. In Section 3 we state and prove the main approximation
theorem, Theorem 3.1, which is the discrete equivalent to Theorem 1.2 of
[14]. We also prove a second general result, Theorem 3.11, from which,
under an additional assumption, more accurate rates can be obtained for local
limit results. In Section 4 some applications are given, among others to the
anti-voter model.

2. Exchangeable pairs for normal approximation and a smoothness condition.
We call a pair of random variables $(W, W')$ exchangeable if
$\mathscr{L}(W, W') = \mathscr{L}(W', W)$. As in [17] and [14], assume now that there is a
positive number $\lambda < 1$ and a random variable $R$ such that

(2.1)    $\mathbb{E}^W(W' - \mu) = (1 - \lambda)(W - \mu) + R$

holds, where $\mathbb{E}^W$ denotes the conditional expectation with respect to $W$. Of
course, one can always find $R$ to satisfy (2.1), so $R$ must be thought of as
being small for the approximation to be successful. Note that (2.1) implies
$\mathbb{E}R = 0$, as can be seen by taking expectations and using $\mathbb{E}W' = \mathbb{E}W = \mu$.

If the pair $(W, W')$ can be chosen such that condition (2.1) holds and
$\mathbb{E}^W(W' - W)^2$ does not fluctuate too much, convergence of $W_c$ to the
standard normal distribution will follow in the Kolmogorov metric. As the
behavior of the difference $W' - W$ is mainly responsible for the quality of
the approximation, it is an obvious starting point to introduce a smoothness
condition, to make sure that the local perturbations of $W$ are not too
strong.

Rinott and Rotar [14] propose to choose $W$ and $W'$ as two successive
steps of a reversible Markov chain with stationary distribution $\mathscr{L}(W)$. Then,
condition (2.1) states that a particle on $\mathbb{Z}$ obeying the transition rules of such
a Markov chain is forced to have an (almost) linear drift to the center. Now
$\mathbb{E}^{W=k}(W' - W)^2$ is the average of the squared jump size of the Markov chain
if the particle is in $k$, so that, for a good normal approximation, the average
squared jump size of the particle must not fluctuate too much with varying $k$.
It is clear that, under these conditions, the particle may still behave irregularly
on a local scale; for instance, the particle could still make only jumps of size
two and thus stay on the odd or even integers, such that an approximation
with a distribution on $\mathbb{Z}$ with span 1 will not be successful in total variation.

Thus, in addition to (2.1), we assume further that

(2.2)    $W' - W \in \{-1, 0, +1\}$,

and we will see that this seems to be an appropriate condition. Note that
under condition (2.2) the corresponding Markov chain does not need to be
reversible for $(W, W')$ to be an exchangeable pair; see Lemma 1.1 of [14].

Condition (2.2) is in sharp contrast to other approaches using Stein's
method for the translated Poisson distribution such as [7, 15] or [3], where
an embedded sum of independent random variables within $W$ is used for an
explicit smoothing argument; in contrast, the smoothing effect of (2.2) will
enter only implicitly into the proof of the main result. As we are restricted
to the integers, we cannot arbitrarily shift a Poisson distribution with a
given variance to fit the mean, so some care is needed here. We say that an
integer valued random variable $Y$ has a translated Poisson distribution with
parameters $\mu$ and $\sigma^2$ and write

    $\mathscr{L}(Y) = \mathrm{TP}(\mu, \sigma^2)$

if $\mathscr{L}(Y - \mu + \sigma^2 + \gamma) = \mathrm{Po}(\sigma^2 + \gamma)$, where $\gamma = \langle \mu - \sigma^2 \rangle$ and
$\langle x \rangle = x - \lfloor x \rfloor$ denotes the fractional part of $x$; in particular,
$\mathrm{TP}(\sigma^2, \sigma^2) = \mathrm{Po}(\sigma^2)$. So, approximating $W$ with $\mathrm{TP}(\mu, \sigma^2)$, we can fit
the mean exactly, but note that for the variance we have
$\sigma^2 \le \operatorname{Var} Y = \sigma^2 + \gamma < \sigma^2 + 1$. This will, however, cause no further
problems as the order of error of this mismatch is $O(\sigma^{-2})$;
see also Remark 3.5 below.
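
As a small numerical illustration of the definition above (a sketch only, not part of the original argument), the following Python function tabulates the point probabilities of $\mathrm{TP}(\mu, \sigma^2)$ by shifting a Poisson distribution with mean $\sigma^2 + \gamma$ by $s = \lfloor \mu - \sigma^2 \rfloor$. The function name `tp_pmf`, the truncation parameter `terms` and the parameter values are hypothetical choices made for this example.

```python
import math

def tp_pmf(mu, sigma2, terms=400):
    """Point probabilities of TP(mu, sigma2) as a dict {k: P[Y = k]}.

    Y - s is Poisson with mean sigma2 + gamma, where s = floor(mu - sigma2)
    and gamma = frac(mu - sigma2). Only the first `terms` support points are
    kept; assumes sigma2 > 0.
    """
    s = math.floor(mu - sigma2)
    gamma = (mu - sigma2) - s
    lam = sigma2 + gamma
    # Work on the log scale to avoid overflow in lam**j / j!
    return {s + j: math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))
            for j in range(terms)}

mu, sigma2 = 7.3, 2.5                      # hypothetical parameters
pmf = tp_pmf(mu, sigma2)
total = sum(pmf.values())
mean = sum(k * q for k, q in pmf.items())
var = sum((k - mean) ** 2 * q for k, q in pmf.items())
print(total, mean, var)
```

With these parameters $s = 4$ and $\gamma = 0.8$, so the mean is matched exactly while the variance is $\sigma^2 + \gamma$ rather than $\sigma^2$.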

Throughout the paper, we shall be concerned with two metrics for probability
distributions, the total variation metric $d_{\mathrm{TV}}$ and the local limit metric
$d_{\mathrm{loc}}$, where, for two probability distributions $P$ and $Q$ given by the point
probabilities $\{p_k, k \in \mathbb{Z}\}$ and $\{q_k, k \in \mathbb{Z}\}$ respectively,

    $d_{\mathrm{TV}}(P, Q) := \sup_{A \subset \mathbb{Z}} |P(A) - Q(A)| = \frac{1}{2} \sum_{k \in \mathbb{Z}} |p_k - q_k|$,
    $d_{\mathrm{loc}}(P, Q) := \sup_{k \in \mathbb{Z}} |p_k - q_k|$.

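
The two metrics just defined are straightforward to evaluate for distributions with finite support. The following Python sketch does so for point probabilities stored as dictionaries; the function names and the two toy distributions are hypothetical and serve only as an illustration.

```python
def d_tv(p, q):
    """Total variation distance: half the l1 distance of the point probabilities."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in support)

def d_loc(p, q):
    """Local limit metric: largest difference between point probabilities."""
    support = set(p) | set(q)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in support)

# Two toy distributions on the integers (hypothetical numbers).
P = {0: 0.5, 1: 0.3, 2: 0.2}
Q = {1: 0.25, 2: 0.25, 3: 0.5}
print(d_tv(P, Q), d_loc(P, Q))
```

Since each single point is itself a set $A$, the local limit metric never exceeds the total variation metric.
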
3. Main results. 

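Theorem 3.1. Assume that $(W, W')$ is an exchangeable pair with values
on the integers and which satisfies (1.1), (2.1) and (2.2). Then, with
$S = S(W) = \mathbb{P}[W' = W + 1 \mid W]$ and $q_{\max} = \max_{k \in \mathbb{Z}} \mathbb{P}[W = k]$,

(3.1)    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) \le \frac{\sqrt{\operatorname{Var} S}}{\lambda\sigma^2} + \frac{2\sqrt{\operatorname{Var} R}}{\lambda\sigma} + \frac{2}{\sigma^2}$,

(3.2)    $d_{\mathrm{loc}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) \le \frac{2\sqrt{q_{\max} \operatorname{Var} S}}{\lambda\sigma^2} + \frac{2 q_{\max} \sqrt{\operatorname{Var} R}}{\lambda\sigma} + \frac{\sqrt{\operatorname{Var} R}}{\lambda\sigma^2} + \frac{2}{\sigma^2}$.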

Remark 3.2. In some of the applications, instead of $S(W) = \mathbb{P}[W' = W + 1 \mid W]$,
we will estimate the variance of a more general random variable
$S^* = S^*(X) := \mathbb{P}[W' = W + 1 \mid X]$ for some random variable $X$ such that the
corresponding $\sigma$-algebras satisfy $\sigma(W) \subset \sigma(X)$, and then use the basic fact
that $\operatorname{Var} S \le \operatorname{Var} S^*$, which holds because $S = \mathbb{E}^W S^*$.

Example 3.3. To illustrate the above theorem, we apply it to the
Poisson-binomial distribution. To this end let $J = (J_1, \dots, J_n)$ be a sequence
of independent indicators with $\mathbb{E}J_i = p_i$ and $W = \sum_{i=1}^n J_i$; thus,
$\mu = \sum_{i=1}^n p_i$ and $\sigma^2 = \sum_{i=1}^n p_i(1 - p_i)$. We use the standard construction
of [17] to obtain an exchangeable pair. Let $J_1^*, \dots, J_n^*$ be independent copies
of the $J_i$ and let $K$ be uniformly distributed over $\{1, \dots, n\}$. Then, with
$W' = W - J_K + J_K^*$, it is easy to check that $(W, W')$ is an exchangeable pair,
satisfying (2.1) with $R = 0$ and $\lambda = 1/n$ and, clearly, (2.2) is also satisfied. So,

(3.3)    $S^*(J) := \mathbb{E}^J I[W' - W = 1]$
             $= \frac{1}{n} \sum_{i=1}^n \mathbb{E}^J I[J_i = 0, J_i^* = 1]$
             $= \frac{1}{n} \sum_{i=1}^n (1 - J_i)\, \mathbb{E}^J J_i^*$
             $= \frac{1}{n} \sum_{i=1}^n p_i (1 - J_i)$.

Thus, $\operatorname{Var} S^* = n^{-2} \sum_{i=1}^n p_i^3 (1 - p_i)$, and, by Remark 3.2, (3.1) yields

(3.4)    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) \le \frac{2 + \sqrt{\sum_{i=1}^n p_i^3 (1 - p_i)}}{\sum_{i=1}^n p_i (1 - p_i)}$.

Assume now that the $p_i$ are bounded away from 0 and 1, so that $\sigma^2 \asymp n$
as $n \to \infty$. Then (3.4) is of the correct order $O(n^{-1/2})$. This also
implies that $q_{\max} = O(n^{-1/2})$ (see Corollary 3.9 below) so that (3.2) yields
$d_{\mathrm{loc}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = O(n^{-3/4})$, which in contrast is not optimal. We will
improve this bound using Theorem 3.11 below.

In the above case of the Poisson-binomial distribution, Corollary 2.1 of
[7] seems to be better in constant than (3.1). For instance, for the binomial
distribution, we have

    $d_{\mathrm{TV}}(\mathrm{Bi}(n, p), \mathrm{TP}(\mu, \sigma^2)) \le C \sqrt{\frac{p}{n(1 - p)}} + \frac{2}{np(1 - p)}$,

where [7] obtain $C = 0.93$ and (3.1) yields $C = 1$.
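
To get a feeling for how conservative the binomial bound with $C = 1$ is, one can compare it with the exact total variation distance, which is computable for moderate $n$. The following Python sketch does this for one hypothetical parameter choice ($n = 50$, $p = 0.3$); the helper names are assumptions made for this illustration, and the translated Poisson probabilities are truncated after 400 terms.

```python
import math

def binomial_pmf(n, p):
    return {k: math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def tp_pmf(mu, sigma2, terms=400):
    s = math.floor(mu - sigma2)
    lam = mu - s                            # equals sigma2 + gamma
    return {s + j: math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))
            for j in range(terms)}

def d_tv(p, q):
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

n, p = 50, 0.3                              # hypothetical parameters
mu, sigma2 = n * p, n * p * (1 - p)
exact = d_tv(binomial_pmf(n, p), tp_pmf(mu, sigma2))
# Right-hand side of the binomial bound with C = 1
bound = math.sqrt(p / (n * (1 - p))) + 2 / (n * p * (1 - p))
print(exact, bound)
```

The comparison only illustrates the inequality for a single parameter pair; it says nothing about the optimal constant.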

Remark 3.4. Theorem 3.1 is a direct analogue of Theorem 1.2 of [14]. 
However, the first term in (3.1) is slightly different in quality from Theorem 
1.2 of [14], as can be seen by comparing the result of their Theorem 1.3 for 
the anti-voter model with estimate (4.4) below. 

Remark 3.5. The additional $2/\sigma^2$ in (3.1) and (3.2) appears because
the Poisson distribution cannot take negative values, and because the
translation must be integer valued. Depending on the problem at hand, this
error term can be further reduced or even be omitted by replacing estimates
(3.12) and (3.17) in the proof below. For example, to obtain the best possible
total variation estimates from (3.1) in the Poisson-binomial case, recall
that $\gamma = \langle \mu - \sigma^2 \rangle = \langle \sum p_i^2 \rangle$ and
$s = \lfloor \mu - \sigma^2 \rfloor = \mu - \sigma^2 - \gamma = \sum p_i^2 - \langle \sum p_i^2 \rangle$.
From (2.8) of [7] we obtain for (3.12)

    $\mathbb{P}[W < s] \le e^{-\sigma^2/4}$ if $s > 0$, and $\mathbb{P}[W < s] = 0$ if $s = 0$.

For the last term in (3.16), we have

    $|\mathbb{E}\{\gamma \Delta \tilde g(W)\}| \le \|\Delta g\| \langle \sum p_i^2 \rangle$.

Using the first inequality of the estimate of $\|\Delta g\|$ in (3.13) and applying this
in the above estimate and also in (3.24), we obtain

(3.5)    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) \le \frac{1 - e^{-\sigma^2 - \gamma}}{\sigma^2 + \gamma} \Bigl( \sqrt{\sum p_i^3 (1 - p_i)} + \langle \sum p_i^2 \rangle \Bigr) + I[s > 0]\, e^{-\sigma^2/4}$.

This estimate now covers also the regime of Poisson approximation. However,
(3.5) is larger in constant than previous results and one would have to go
back to the proof of the theorem to reproduce the bounds of [1] and [8]; see
Remark 3.8.

Remark 3.6. As becomes clear from equation (3.22) in the proof of
Theorem 3.1, there is a close connection between the random variable
$S = S(W)$ and the so-called $w$-functions as examined, for example, in [5] and [6]
for the normal and the Poisson distributions. In the case of the standard
normal distribution, their problem is as follows: for a given random variable
$X$ with $\mathbb{E}X = 0$ and $\operatorname{Var} X = 1$, find a function $w : \mathbb{R} \to \mathbb{R}$ such that

(3.6)    $\mathbb{E}\{X f(X)\} = \mathbb{E}\{w(X) f'(X)\}$

holds for a large set of functions $f$. For the translated Poisson distribution,
the corresponding equation is

(3.7)    $\mathbb{E}\{(W - \mu) f(W)\} = \mathbb{E}\{w(W) \Delta f(W)\}$,

and it is indeed satisfied for any $W$ as in Theorem 3.1 if $R = 0$ and if
we choose $w(W) = S(W)/\lambda$. Unfortunately, it is often difficult to give an
explicit expression for $S$ as a function of $W$. However, if we allow $w(W)$ in
(3.7) to be replaced by a more general random variable, we see from (3.22)
that we can use the random variable $S^*(X)/\lambda$ from Remark 3.2 instead. For
instance, for the anti-voter model as discussed in the next section, $S^*(X)$
has the nice and explicit representation (4.7).

Instead of (3.6), one can also formulate the problem of finding a random
variable $X^z$ such that

(3.8)    $\mathbb{E}\{X f(X)\} = \mathbb{E} f'(X^z)$,

which leads to the so-called zero biasing approach. There is a close
connection between this and the exchangeable pairs approach; see [10] and
references therein, and for more general versions of (3.8), see [11].

Before proving Theorem 3.1, we give a short introduction to Stein's
method for translated Poisson approximation. The starting point is the
Stein-Chen method for the Poisson distribution as presented in detail by
Barbour, Holst and Janson [2].

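Let $W$ satisfy (1.1) and let $s = \lfloor \mu - \sigma^2 \rfloor$ and $\gamma = \langle \mu - \sigma^2 \rangle$, where
$\langle x \rangle = x - \lfloor x \rfloor$ denotes the fractional part of $x$. Note that, if
$Y \sim \mathrm{TP}(\mu, \sigma^2)$, then $Y - s \sim \mathrm{Po}(\sigma^2 + \gamma)$. Let
$\mathcal{A}g(j) = (\sigma^2 + \gamma) g(j + 1) - j g(j)$ be the usual Stein operator for the Poisson
distribution with mean $\sigma^2 + \gamma$, and for $A \subset \mathbb{Z}_+ := \{0, 1, 2, \dots\}$, let
$g_A : \mathbb{Z} \to \mathbb{R}$ be the solution of the following:

(3.9)     (i)  $g(j) = 0$ for all $j \le 0$,

(3.10)    (ii) $\mathcal{A}g(j) = I[j \in A] - \mathrm{Po}(\sigma^2 + \gamma)\{A\}$ for all $j \ge 0$.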
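We can thus bound the total variation distance as

(3.11)    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = d_{\mathrm{TV}}(\mathscr{L}(W - s), \mathrm{Po}(\sigma^2 + \gamma))$
              $= \sup_{B \subset \mathbb{Z}} |\mathbb{E} I[W - s \in B] - \mathrm{Po}(\sigma^2 + \gamma)\{B\}|$
              $\le \sup_{A \subset \mathbb{Z}_+} |\mathbb{E} \mathcal{A} g_A(W - s)| + \mathbb{P}[W - s < 0]$.

The last term in (3.11) can be bounded using Chebyshev's inequality as

(3.12)    $\mathbb{P}[W - s < 0] = \mathbb{P}[W - \mu < -(\sigma^2 + \gamma)]$
              $\le \mathbb{P}[|W - \mu| \ge \sigma^2 + \gamma] \le \frac{\sigma^2}{(\sigma^2 + \gamma)^2} \le \sigma^{-2}$.

From [2], Lemma 1.1.1, we have the well-known bounds on the supremum
norm of $g_A$,

(3.13)    $\|g_A\| \le (\sigma^2 + \gamma)^{-1/2} \le \sigma^{-1}$,
          $\|\Delta g_A\| \le \frac{1 - e^{-\sigma^2 - \gamma}}{\sigma^2 + \gamma} \le \sigma^{-2}$,

where $\Delta g_A(j) := g_A(j + 1) - g_A(j)$. If $A = \{k\}$ for some $k \in \mathbb{Z}_+$, we even have

(3.14)    $\|g_{\{k\}}\| \le \sigma^{-2}$.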

For the proof of the results in the $d_{\mathrm{loc}}$ metric, we will also need the following
nonstandard but simple result.

Lemma 3.7. Let $g_i$ be the solution of (3.9)-(3.10) for $A = \{i\}$. Then

(3.15)    $\sum_k |\Delta g_i(k)| \le 2\sigma^{-2}$,    $\sum_k (\Delta g_i(k))^2 \le 4\sigma^{-4}$.

Proof. Recall from [2], proof of Lemma 1.1.1, that $g_i(k)$ is negative
and decreasing in $0 < k \le i$ and positive and decreasing in $k > i$, with the
only positive jump in $i$ satisfying

    $|\Delta g_i(i)| \le (\sigma^2 + \gamma)^{-1} \le \sigma^{-2}$.

As $g_i$ starts at 0 and tends to 0, the negative jumps add up to the size of
the single positive jump, so that the first bound of (3.15) holds; the second
bound is then immediate from $\sum_k a_k^2 \le (\sum_k |a_k|)^2$. □
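
The shape of $g_i$ used in the proof of Lemma 3.7 can be inspected numerically. The following Python sketch is an illustration under stated assumptions: it evaluates the solution of the Poisson Stein equation for $A = \{i\}$ through the standard explicit representation $g(j+1) = (\mathrm{Po}\{A \cap U_j\} - \mathrm{Po}\{A\}\mathrm{Po}\{U_j\})/(\lambda \mathrm{Po}\{j\})$ with $U_j = \{0, \dots, j\}$, which is not spelled out in the text, and then checks the Stein equation itself as well as the two sums in (3.15) with $\lambda$ playing the role of $\sigma^2 + \gamma$. The function name and the values $\lambda = 8$, $i = 6$ are hypothetical.

```python
import math

def point_stein_solution(lam, i, kmax=60, terms=400):
    """Values g_i(0), ..., g_i(kmax) solving the Poisson(lam) Stein equation for A = {i}.

    Uses the explicit form g(j+1) = (Po(A & U_j) - Po(A) Po(U_j)) / (lam Po{j}),
    U_j = {0, ..., j}, split into two cancellation-free cases. Requires i < kmax.
    """
    pmf = [math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) for k in range(terms)]
    cdf, acc = [], 0.0
    for q in pmf:
        acc += q
        cdf.append(acc)
    tail = [0.0] * terms                   # tail[j] = P[Y > j], accumulated from the far end
    for j in range(terms - 2, -1, -1):
        tail[j] = tail[j + 1] + pmf[j + 1]
    g = [0.0]                              # boundary condition (3.9)
    for j in range(kmax):
        if j < i:
            g.append(-pmf[i] * cdf[j] / (lam * pmf[j]))
        else:
            g.append(pmf[i] * tail[j] / (lam * pmf[j]))
    return g, pmf

lam, i = 8.0, 6                            # hypothetical Poisson mean and target point
g, pmf = point_stein_solution(lam, i)
jumps = [g[k + 1] - g[k] for k in range(len(g) - 1)]
# Residual of the Stein equation (3.10) on j = 0, ..., kmax - 1
residual = max(abs(lam * g[j + 1] - j * g[j] - ((j == i) - pmf[i]))
               for j in range(len(jumps)))
sum_abs = sum(abs(d) for d in jumps)
sum_sq = sum(d * d for d in jumps)
print(residual, sum_abs, 2 / lam, sum_sq, 4 / lam**2)
```

The sums are taken over a finite window only, which can only make them smaller than the full sums in (3.15).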

With $\tilde g_A(j) := g_A(j - s)$, we can rewrite the Stein operator $\mathcal{A}$ as

(3.16)    $\mathcal{A} g_A(W - s) = (\sigma^2 + \gamma) g_A(W - s + 1) - (W - s) g_A(W - s)$
              $= \sigma^2 \Delta \tilde g_A(W) - (W - \mu) \tilde g_A(W) + \gamma \Delta \tilde g_A(W)$.

The bounds on $\tilde g_A$ are of course the same as on $g_A$ in (3.13)-(3.15). Thus,
the expectation of the last term in (3.16) is easily bounded by

(3.17)    $|\mathbb{E}\{\gamma \Delta \tilde g_A(W)\}| \le \gamma \sigma^{-2} \le \sigma^{-2}$.

Inserting (3.16) into (3.11) and invoking the bounds (3.12) and (3.17), we
obtain

(3.18)    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) \le \sup_{A \subset \mathbb{Z}_+} |\mathbb{E}\{\sigma^2 \Delta \tilde g_A(W) - (W - \mu) \tilde g_A(W)\}| + 2\sigma^{-2}$;

the same estimate holds for $d_{\mathrm{loc}}$ but with the supremum taken only over the
sets $A = \{i\}$ for $i \in \mathbb{Z}_+$.

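Proof of Theorem 3.1. We only have to bound the supremum in
(3.18). In [17] it was shown that, if $F$ satisfies $F(w, w') = -F(w', w)$ for
all $w$ and $w'$, exchangeability implies $\mathbb{E}F(W, W') = 0$. Define the random
variable $D := W' - W$ and the function $F(w, w') := (w' - w)(g(w') + g(w))$
for $g = \tilde g_A$ and note that, from (2.1), $\mathbb{E}^W D = -\lambda(W - \mu) + R$. This yields

(3.19)    $0 = \mathbb{E}F(W, W') = \mathbb{E}\{D(2g(W) + g(W') - g(W))\}$
              $= -2\lambda \mathbb{E}\{(W - \mu) g(W)\} + 2\mathbb{E}\{R g(W)\} + \mathbb{E}\{D(g(W') - g(W))\}$.

Note now that, for $D_i := I[D = i]$, $i \in \{-1, +1\}$, we can write

    $D(g(W') - g(W)) = D_{+1} \Delta g(W) + D_{-1} \Delta g(W - 1)$,

and further, using exchangeability,

(3.20)    $\mathbb{E}\{D_{-1} \Delta g(W - 1)\} = \mathbb{E}\{I[W' - W = -1] \Delta g(W - 1)\}$
              $= \mathbb{E}\{I[W - W' = -1] \Delta g(W' - 1)\}$
              $= \mathbb{E}\{D_{+1} \Delta g(W)\}$;

thus,

(3.21)    $\mathbb{E}\{D(g(W') - g(W))\} = 2\mathbb{E}\{D_{+1} \Delta g(W)\}$.

Together with (3.19) this yields

(3.22)    $\mathbb{E}\{(W - \mu) g(W)\} = \frac{\mathbb{E}\{D_{+1} \Delta g(W)\}}{\lambda} + \frac{\mathbb{E}\{R g(W)\}}{\lambda}$.

Note now that, by exchangeability, $\mathbb{E}D_{+1} = \mathbb{E}D_{-1}$ and, hence, that

(3.23)    $\mathbb{E}D_{+1} = \frac{1}{2} \mathbb{E}(W' - W)^2$
              $= \frac{1}{2} [\mathbb{E}(W' - \mu)^2 - 2\mathbb{E}\{(W' - \mu)(W - \mu)\} + \mathbb{E}(W - \mu)^2]$
              $= \lambda\sigma^2 - \mathbb{E}\{(W - \mu) R\} =: \lambda\sigma^2 + a$,

from (2.1); then use (3.22) to express the expectation in (3.18) as

    $\mathbb{E}\{(W - \mu) g(W) - \sigma^2 \Delta g(W)\}$
        $= \mathbb{E}\{(W - \mu) g(W) - (\sigma^2 + \lambda^{-1} a) \Delta g(W)\} + \lambda^{-1} a\, \mathbb{E} \Delta g(W)$
        $= \mathbb{E}\{(D_{+1} \lambda^{-1} - \sigma^2 - \lambda^{-1} a) \Delta g(W)\} + \lambda^{-1} \mathbb{E}\{R g(W)\} + \lambda^{-1} a\, \mathbb{E} \Delta g(W)$
        $=: B_1 + B_2 + B_3$.

Now, recall that $S = \mathbb{E}^W D_{+1}$, and thus, with the estimates

(3.24)    $|B_1| \le \|\Delta g\| \lambda^{-1} \mathbb{E}|S - \mathbb{E}S| \le \|\Delta g\| \lambda^{-1} \sqrt{\operatorname{Var} S}$,

(3.25)    $|B_2| \le \|g\| \lambda^{-1} \mathbb{E}|R| \le \|g\| \lambda^{-1} \sqrt{\operatorname{Var} R}$,

(3.26)    $|B_3| \le \|\Delta g\| \lambda^{-1} \mathbb{E}|(W - \mu) R| \le \|\Delta g\| \lambda^{-1} \sigma \sqrt{\operatorname{Var} R}$,

and the bounds (3.13), (3.1) follows.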

To prove (3.2), we also use (3.18), but now we take the supremum only
over all subsets $A = \{i\}$ for $i \in \mathbb{Z}_+$. Writing $g = \tilde g_{\{i\}}$ and following the proof as
for $d_{\mathrm{TV}}$ above, the bound (3.25) remains and, recalling (3.14), the third
term in (3.2) follows. We thus need only refine the bounds on $B_1$ and $B_3$.
Note that by the Cauchy-Schwarz inequality

    $|B_1| \le \lambda^{-1} \sqrt{\operatorname{Var} S} \sqrt{\mathbb{E}(\Delta g(W))^2}$.

Using Lemma 3.7, the latter expectation can be bounded by

(3.27)    $\mathbb{E}(\Delta g(W))^2 = \sum_k (\Delta g(k))^2\, \mathbb{P}[W = k]$
              $\le q_{\max} \sum_k (\Delta g(k))^2 \le 4\sigma^{-4} q_{\max}$,

which implies the first term in (3.2). Using a similar argument on $B_3$, we
obtain

    $|B_3| \le \lambda^{-1} \sigma \sqrt{\operatorname{Var} R}\; q_{\max} \sum_k |\Delta g(k)|$,

which, together with Lemma 3.7, yields the second term in (3.2). □
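
The key identity (3.22) can be checked exactly in the simplest instance of Example 3.3, the binomial case $p_i = p$, where $R = 0$, $\lambda = 1/n$ and $S(w) = p(n - w)/n$. The following Python sketch uses rational arithmetic; the values of $n$ and $p$ and the test function `g` are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from math import comb

n, p = 12, Fraction(3, 10)                 # hypothetical parameters
lam = Fraction(1, n)                       # lambda = 1/n for this coupling
mu = n * p
sigma2 = n * p * (1 - p)

def pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def S(w):
    # P[W' = W + 1 | W = w]: pick a zero indicator (prob (n - w)/n), resample it as one (prob p)
    return p * Fraction(n - w, n)

def g(w):
    # arbitrary test function (hypothetical choice)
    return w**3 - 2 * w + 1

# Left and right side of (3.22) with R = 0, conditioning D_{+1} on W
lhs = sum(pmf(k) * (k - mu) * g(k) for k in range(n + 1))
rhs = sum(pmf(k) * S(k) * (g(k + 1) - g(k)) for k in range(n + 1)) / lam
ES = sum(pmf(k) * S(k) for k in range(n + 1))
print(lhs == rhs, ES == lam * sigma2)
```

The second comparison is the statement $\mathbb{E}D_{+1} = \lambda\sigma^2$ from (3.23) in the case $R = 0$.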

Remark 3.8. It is interesting to compare our approach to the one used
in [8], who also use exchangeable pairs but for Poisson approximation. As we
have $\mathrm{TP}(\mu, \mu) = \mathrm{Po}(\mu)$, it should be expected that we can reproduce their
results. This is indeed the case.

Now, assume our conditions (2.1) and (2.2) and assume that we are in
the regime of Poisson approximation, that is, $\sigma^2 \approx \mu$. We also assume for
the sake of simplicity that $R = 0$. Taking $\mathrm{TP}(\mu, \mu)$ as the approximating
distribution, it is easy to see that the Stein operator (3.16) reduces to the
classical Stein operator

    $\mathcal{A}g(w) := \mu g(w + 1) - w g(w)$

for $\mathrm{Po}(\mu)$ also used in [8]. Using the anti-symmetric function from the proof
of Theorem 3.1, we have

(3.28)    $0 = \mathbb{E}\{(D_{+1} - D_{-1})(g(W') + g(W))\}$
              $= \mathbb{E}\{D_{+1} g(W + 1) - D_{-1} g(W)\}$,

where for the second equality we exploited the same argument as in (3.20)
and the fact that $W' = W + 1$ if $D_{+1} = 1$. Note that (3.28) is the same
equality as in [8], obtained through a different anti-symmetric function.
Multiplying (3.28) by an arbitrary constant $c$, we obtain the bound

(3.29)    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{Po}(\mu)) = d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \mu)) = \sup_g |\mathbb{E}\mathcal{A}g(W)|$
              $\le \sup_g |\mathbb{E}\{(\mu - c\,\mathbb{E}^W D_{+1}) g(W + 1) - (W - c\,\mathbb{E}^W D_{-1}) g(W)\}|$,

where the supremum ranges over the same functions $g$ as in (3.11). Note
that, from (3.23), we have $\mathbb{E}D_{+1} = \lambda\sigma^2 \approx \lambda\mu$, so that one expects
$\mathbb{E}^W D_{+1} \approx \lambda\mu$, and this, in conjunction with (2.1), also implies
$\mathbb{E}^W D_{-1} \approx \lambda W$, so that, with the choice $c = \lambda^{-1}$, (3.29) is
expected to be reasonably small in the regime of Poisson approximation.

Instead of (2.1) and (2.2), in [8] it is only assumed that $(W, W')$ is
exchangeable. With this assumption, they prove the same bound (3.29), where
again $c$ can be chosen arbitrarily. It is noteworthy that, although differences
$|W' - W|$ larger than 1 are allowed in their approach, again only jumps
of size 1 appear in (3.29); this is a consequence of exchangeability.

So, we are able to reproduce the estimates of [8] under our assumptions,
by taking $\mathrm{TP}(\mu, \mu)$ instead of $\mathrm{TP}(\mu, \sigma^2)$ as the approximation. However,
we have the extra flexibility of being able to match mean and variance
separately, so that our approach also works when $\sigma^2$ is not near $\mu$; for
instance, if $\sum p_i^2$ is not small in the Poisson-binomial case. In contrast, in
[8] they do not assume (2.1), and allow for differences larger than 1.

Using (3.1) with the following corollary, one easily obtains a bound for
$q_{\max}$.

Corollary 3.9. For any $\mathbb{Z}$-valued random variable $W$,

    $q_{\max} \le d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) + \frac{1}{2.3\sigma}$.

Proof. Just apply Proposition A.2.7 of [2]. □

Remark 3.10. Estimate (3.2) in combination with Corollary 3.9 is
enough to obtain a local limit theorem in the applications of the next section.
Although it can be easily calculated in many circumstances, the example of
the Poisson-binomial distribution shows that the bound on $d_{\mathrm{loc}}$ need not be
optimal; estimate (3.2) is of order $O(n^{-3/4})$ in the special case of the
binomial distribution, in contrast to the true order $O(n^{-1})$. Under additional
assumptions on $S$, however, the bound (3.2) can be used to derive the
better $d_{\mathrm{loc}}$-bound given in the following theorem. This bound is used in the
examples of Sections 4.1 and 4.2 to obtain the correct order $O(n^{-1})$ of
approximation.

Theorem 3.11. Assume the conditions of Theorem 3.1; assume, in
addition, that $S$, as a function of $W$, can be extended on $\mathbb{R}$ such that it is
Lipschitz continuous. Then,

(3.30)    $d_{\mathrm{loc}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) \le \frac{2 L_S \bigl( \sigma^{-3} \mathbb{E}|W - \mu|^3 \vee (d\sigma^{3/2} + 1) \bigr)}{\lambda\sigma^2} + \frac{2 L_S q_{\max}}{\lambda\sigma}$
              $+ \frac{2 q_{\max} \sqrt{\operatorname{Var} R}}{\lambda\sigma} + \frac{\sqrt{\operatorname{Var} R}}{\lambda\sigma^2} + \frac{2}{\sigma^2}$,

where $d$ is the $d_{\mathrm{loc}}$-bound (3.2) and $L_S$ is the Lipschitz constant of $S$.

To obtain useful bounds from the above theorem, it is essential that one
has a good bound on $L_S$. In Sections 4.1 and 4.2 and in the special
case of the anti-voter model on the complete graph (Example 4.7), this is
easily done, because there we know $S$ explicitly. Recall also Example 3.3
for the binomial distribution, that is, $p_i = p$ for some fixed $p$. Then (3.3)
yields $S^*(J) = \lambda p (n - W) = S(W)$. Clearly, $L_S = \lambda p$, so that from (3.30) we
obtain the correct order $O(n^{-1})$ for the $d_{\mathrm{loc}}$-metric. For the general
Poisson-binomial and anti-voter models from Section 4.3, however, we only know a
more general function $S^*(J)$ with $S(W) = \mathbb{E}^W S^*(J)$ (see Remark 3.2), and
it is unclear how to obtain useful bounds on $L_S$ in these cases.

To prove Theorem 3.11, we need the following lemma.

Lemma 3.12. For any $\mu$ and $\sigma^2$, the bound

    $\mathrm{TP}(\mu, \sigma^2)\{k\}\, |k - \mu| \le 1$

holds for all $k \in \mathbb{Z}$.



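Proof. Recall from (3.10) that, if $Z \sim \mathrm{TP}(\mu, \sigma^2)$,

(3.31)    $\mathbb{E}\{(Z - \mu) g(Z) - (\sigma^2 + \gamma) \Delta g(Z)\} = 0$

for any $g$ for which the expectations exist. With $\pi_k = \mathrm{TP}(\mu, \sigma^2)\{k\}$ and
putting $g(\cdot) = I[\cdot = k]$ we obtain from (3.31) the bound

    $\pi_k |k - \mu| \le (\sigma^2 + \gamma) |\pi_{k-1} - \pi_k|$
        $\le (\sigma^2 + \gamma)\, d_{\mathrm{loc}}(\mathrm{TP}(\mu + 1, \sigma^2), \mathrm{TP}(\mu, \sigma^2))$
        $= (\sigma^2 + \gamma)\, d_{\mathrm{loc}}(\mathscr{L}(Y + 1), \mathscr{L}(Y))$,

where $Y \sim \mathrm{Po}(\sigma^2 + \gamma)$. The latter $d_{\mathrm{loc}}$-distance can easily be bounded using
Stein's method for the Poisson distribution, that is, (3.10) in connection
with the bound (3.14), which yields $d_{\mathrm{loc}}(\mathscr{L}(Y + 1), \mathscr{L}(Y)) \le (\sigma^2 + \gamma)^{-1}$. □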
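
Lemma 3.12 is easy to probe numerically. The following Python sketch, meant only as an illustration, evaluates $\sup_k \mathrm{TP}(\mu, \sigma^2)\{k\}\,|k - \mu|$ over a truncated support for a few hypothetical parameter pairs; the function name and the truncation after 600 terms are assumptions of this example.

```python
import math

def tp_sup_weighted(mu, sigma2, terms=600):
    """sup_k TP(mu, sigma2){k} * |k - mu| over the (truncated) support."""
    s = math.floor(mu - sigma2)
    lam = mu - s                            # equals sigma2 + gamma
    best = 0.0
    for j in range(terms):
        prob = math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))
        best = max(best, prob * abs(s + j - mu))
    return best

# Hypothetical parameter pairs (mu, sigma2)
params = [(3.0, 0.4), (10.2, 7.5), (50.0, 49.3), (120.7, 30.0)]
values = {pair: tp_sup_weighted(*pair) for pair in params}
print(values)
```
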



Proof of Theorem 3.11. Follow the proof of Theorem 3.1 for the $d_{\mathrm{loc}}$
metric up to the bounds on the $B_i$. The bounds on $|B_2|$ and $|B_3|$ remain.
Recalling that $S$ is a function defined on all of $\mathbb{R}$, write now $B_1$ as

    $B_1 = \lambda^{-1} \mathbb{E}\{(S(W) - \mathbb{E}S(W)) \Delta g(W)\}$
        $= \lambda^{-1} \mathbb{E}\{(S(W) - S(\mu)) \Delta g(W)\} + \lambda^{-1} \mathbb{E}\{S(\mu) - S(W)\}\, \mathbb{E} \Delta g(W)$
        $=: B_{1,1} + B_{1,2}$.

Exploiting Lipschitz continuity of $S$ and recalling (3.15), we obtain with
$q_k = \mathbb{P}[W = k]$

    $|B_{1,2}| \le \lambda^{-1} \sigma L_S \sum_k q_k |\Delta g(k)| \le \frac{2 L_S q_{\max}}{\lambda\sigma}$,

which is the second term in (3.30). For $B_{1,1}$, we have

(3.32)    $|B_{1,1}| \le \lambda^{-1} \sum_k q_k |S(k) - S(\mu)|\, |\Delta g(k)|$
              $\le \lambda^{-1} L_S \sum_k q_k |k - \mu|\, |\Delta g(k)|$.

We now bound $q_k |k - \mu|$. Assume first that $|k - \mu| > \sigma^{3/2}$; then, by
Chebyshev's inequality,

    $q_k \le \mathbb{P}[|W - \mu| \ge |k - \mu|] \le \frac{\mathbb{E}|W - \mu|^3}{|k - \mu|^3}$

and, thus,

    $q_k |k - \mu| \le \sigma^{-3} \mathbb{E}|W - \mu|^3$.

On the other hand, if $|k - \mu| \le \sigma^{3/2}$, observe that

    $q_k \le d + \mathrm{TP}(\mu, \sigma^2)\{k\}$

and, hence, using Lemma 3.12,

    $q_k |k - \mu| \le d\sigma^{3/2} + 1$.

Thus, (3.32) can be further bounded to

    $|B_{1,1}| \le \lambda^{-1} L_S \bigl( \sigma^{-3} \mathbb{E}|W - \mu|^3 \vee (d\sigma^{3/2} + 1) \bigr) \sum_k |\Delta g(k)|$

and applying again (3.15), this yields the first term in (3.30). □

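The following lemma can be used to estimate the second and third
moments of $W$.

Lemma 3.13. Under the assumptions of Theorem 3.1 and with
$A = \{w : \mathbb{P}[W = w] > 0\}$ and $a := -\mathbb{E}\{R(W - \mu)\}$,

    $\lambda^{-1} \bigl( \inf_{w \in A} S(w) - a \bigr) \le \sigma^2 \le \lambda^{-1} \bigl( \sup_{w \in A} S(w) - a \bigr)$,

    $\mathbb{E}|W - \mu|^3 \le \lambda^{-1} \bigl( 8 q_{\max} + 1 + \sigma + \mathbb{E}\{|R| (W - \mu)^2\} \bigr)$.

Proof. The estimates for the variance are immediate from equality
(3.23), which states that $\mathbb{E}S(W) = \lambda\sigma^2 + a$, and the bounds

    $\inf_{w \in A} S(w) \le \mathbb{E}S(W) \le \sup_{w \in A} S(w)$.

Note now that, from equation (3.22),

    $\mathbb{E}\{(W - \mu) g(W)\} = \lambda^{-1} \mathbb{E}\{S(W) \Delta g(W)\} + \lambda^{-1} \mathbb{E}\{R g(W)\}$

for all functions $g$ for which the expectations exist. With
$K_\mu(w) = I[w > \mu] - I[w < \mu]$ and $g(w) = K_\mu(w)(w - \mu)^2$, we thus obtain

    $\mathbb{E}|W - \mu|^3 = \lambda^{-1} \mathbb{E}\{S(W) [(W - \mu)^2 + 2(W - \mu) + 1] \Delta K_\mu(W)\}$
        $+ \lambda^{-1} \mathbb{E}\{S(W) (2(W - \mu) + 1) K_\mu(W)\}$
        $+ \lambda^{-1} \mathbb{E}\{R (W - \mu)^2 K_\mu(W)\} =: B_1' + B_2' + B_3'$.

Note now that $|K_\mu(w)| \le 1$ and

    $\Delta K_\mu(w) = I[w = \lfloor \mu \rfloor] + I[w = \lceil \mu \rceil - 1]$,

and thus, as $|\lfloor \mu \rfloor - \mu| \le 1$ and $S \le 1$,

    $|B_1'| \le 8 \lambda^{-1} q_{\max}$,

    $|B_2'| \le \lambda^{-1} + \lambda^{-1} \sigma$.

The bound on $B_3'$ is immediate. □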
4. Applications. In this section we illustrate our results using some
examples in which $W = \sum_{i=1}^n J_i$ for a sequence $J = (J_1, J_2, \dots, J_n)$ of random
indicators. In [4] and [15], cases are considered where the $J_i$ have a local
dependence structure; in contrast, the examples in this paper exhibit global
dependence.

For later use, we recall the following easy to prove fact.

Lemma 4.1. Let $f : \mathbb{R} \to \mathbb{R}$ be a Lipschitz continuous function with
Lipschitz constant $L_f$. Then, for any random variable $X$,

    $\operatorname{Var} f(X) \le L_f^2 \operatorname{Var} X$.

4.1. Hypergeometric distribution. Assume that we have $N$ urns and $m$
balls, and that we distribute the balls uniformly into the $N$ urns, in such
a way that there is at most one ball per urn. Clearly, the number of balls
$W$ in the first $n$ urns has the hypergeometric distribution $\mathrm{Hyp}(m, n, N)$, for
which

    $\sigma^2 = \operatorname{Var} W = \frac{nm(N - n)(N - m)}{(N - 1) N^2}$.

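Theorem 4.2. If $W$ has the hypergeometric distribution, then (3.1) and
(3.2) hold with $R = 0$ and $\lambda = \frac{N}{m(N - m + 1)}$, and we have

(4.1)    $\operatorname{Var} S \le \frac{nm(m + n)^2 (N - n)(N - m)}{m^2 (N - m + 1)^2 (N - 1) N^2}$.

Thus, with $N = N(n) \asymp n$ and $m = m(n) \asymp n$,

    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = O(n^{-1/2})$,
    $d_{\mathrm{loc}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = O(n^{-1})$.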

Proof. Consider the following construction. Pick uniformly an urn with
a ball, and put this ball into any empty urn (including the urn from which
the ball was picked). Denote now by $W'$ the number of balls in the first $n$
urns. Exchangeability of $(W, W')$ is easy to see and condition (2.2) is clearly
satisfied. Now, $W' - W = 1$ is the event that a ball is picked from one of the
urns $n + 1, \dots, N$ and put into one of the empty urns $1, \dots, n$; thus,

(4.2)    $S(W) = \mathbb{P}[W' = W + 1 \mid W]$
             $= \frac{m - W}{m} \times \frac{n - W}{N - m + 1}$
             $= \frac{mn - (m + n) W + W^2}{m(N - m + 1)}$

and, conversely,

    $\mathbb{P}[W' = W - 1 \mid W] = \frac{W}{m} \times \frac{N - n - m + W}{N - m + 1}$;

thus,

    $\mathbb{E}^W(W' - W) = \mathbb{E}^W I[W' - W = 1] - \mathbb{E}^W I[W' - W = -1] = \frac{mn - NW}{m(N - m + 1)}$,

and, as $\mu = mn/N$, (2.1) is satisfied with $R = 0$ and $\lambda = \frac{N}{m(N - m + 1)}$.

Note now from (4.2) that $S$, as a function of $W$, is Lipschitz continuous
with constant $L_S = \frac{m + n}{m(N - m + 1)}$; thus, applying Lemma 4.1, we have

    $\operatorname{Var} S \le \frac{(m + n)^2 \sigma^2}{m^2 (N - m + 1)^2}$.

This is enough to prove the $d_{\mathrm{TV}}$-order and, together with Corollary 3.9,
the order $O(n^{-3/4})$ for the $d_{\mathrm{loc}}$-metric. Now, noting that Lemma 3.13 yields
$\mathbb{E}|W - \mu|^3 = O(n^{3/2})$, we obtain from Theorem 3.11 the desired order $O(n^{-1})$
for the $d_{\mathrm{loc}}$-metric. □
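
The ingredients of the proof above can be verified exactly for small urn schemes. The following Python sketch, an illustration with hypothetical sizes $N = 12$, $m = 5$, $n = 4$, checks with rational arithmetic that the up and down probabilities are in detailed balance with the hypergeometric law (which gives exchangeability), that the drift is linear with $\lambda = N/(m(N - m + 1))$, that $\mathbb{E}S = \lambda\sigma^2$, and that the variance bound obtained from Lemma 4.1 holds.

```python
from fractions import Fraction
from math import comb

N, m, n = 12, 5, 4            # urns, balls, number of "first" urns (hypothetical sizes)
lam = Fraction(N, m * (N - m + 1))
mu = Fraction(m * n, N)
sigma2 = Fraction(n * m * (N - n) * (N - m), (N - 1) * N**2)

def pmf(w):
    # Hyp(m, n, N): w of the m balls fall into the first n urns
    return Fraction(comb(n, w) * comb(N - n, m - w), comb(N, m))

def up(w):      # S(w) = P[W' = W + 1 | W = w], see (4.2)
    return Fraction(m - w, m) * Fraction(n - w, N - m + 1)

def down(w):    # P[W' = W - 1 | W = w]
    return Fraction(w, m) * Fraction(N - n - m + w, N - m + 1)

support = range(max(0, m + n - N), min(m, n) + 1)
balance_ok = all(pmf(w) * up(w) == pmf(w + 1) * down(w + 1) for w in support[:-1])
drift_ok = all(up(w) - down(w) == -lam * (w - mu) for w in support)
ES = sum(pmf(w) * up(w) for w in support)
var_S = sum(pmf(w) * up(w) ** 2 for w in support) - ES**2
L_S = Fraction(m + n, m * (N - m + 1))
print(balance_ok, drift_ok, ES == lam * sigma2, var_S <= L_S**2 * sigma2)
```
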

4.2. A parity problem. Let $J_1, \dots, J_n$ be a sequence of independent
$\mathrm{Be}(1/2)$-distributed random indicators. Define

    $J_{n+1} := 1$ if $\sum_{i=1}^n J_i$ is odd, and $J_{n+1} := 0$ else,

and $V := \sum_{i=1}^{n+1} J_i$, so $V$ is simply obtained by "rounding" a
$\mathrm{Bi}(n, 1/2)$-distributed random variable to the next even integer. An approximation of
$V$ by a translated Poisson distribution will clearly not succeed; however, we
may try with $W := \frac{1}{2} V$.

Consider now the following exchangeable pair coupling. Pick two random
indices $K, L$ uniformly on $\{1, \dots, n + 1\}$ so that almost surely $K \ne L$, and
define

(4.3)    $V' = V + 2 - 2J_K - 2J_L$;

that is, take two summands of $V$ at random, and replace each of them by
its complement.

Lemma 4.3. The pair $(V, V')$ defined as above is an exchangeable pair
and $(W, W') := (\frac{1}{2} V, \frac{1}{2} V')$ satisfies (2.1) and (2.2) with $R = 0$ and
$\lambda = 4/(n + 1)$.

Proof. It is enough to regard the situation on $M = \{0, 1\}^n$ because
the values $J_1, \dots, J_n$ uniquely determine the random variable $J_{n+1}$. Note
first that construction (4.3) gives rise to a discrete time Markov chain on
$M$, with jumps from $j \in M$ to $j' \in M$, if $j'$ differs from $j$ in exactly one
or two coordinates ($j'$ differing in exactly one coordinate corresponds to $K$
or $L$ being equal to $n + 1$). Now, as the jump from $j$ to $j'$ happens with
the same probability as from $j'$ to $j$ and all the states are connected, it is
easy to see that the Markov chain so defined is irreducible and reversible
and that the equilibrium distribution assigns equal probability to any
$j \in M$, which corresponds to $n$ independent $\mathrm{Be}(1/2)$ random variables. Thus,
exchangeability is proved.

Note now that

    $\mathbb{E}^V(V' - V) = \frac{1}{n(n + 1)} \sum_{k=1}^{n+1} \sum_{l \ne k} (2 - 2J_k - 2J_l)$
        $= 2 - \frac{2}{n(n + 1)} \cdot 2nV = 2 - \frac{4V}{n + 1} = -\frac{4}{n + 1} \Bigl( V - \frac{n + 1}{2} \Bigr)$;

as $\mathbb{E}V = (n + 1)/2$ and the drift coefficient is not affected by the scaling
$W = \frac{1}{2} V$, we can take $\lambda = 4/(n + 1)$. □

Theorem 4.4. For $W$ defined as above, (3.1) and (3.2) hold with $R = 0$
and $\lambda = 4/(n + 1)$ and if $n \ge 2$, we have $\sigma^2 = (n + 1)/16$ and

    $\operatorname{Var} S \le \frac{(4n + 2)^2}{16 n^2 (n + 1)}$;

thus, as $n \to \infty$,

    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = O(n^{-1/2})$,
    $d_{\mathrm{loc}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = O(n^{-1})$.

Proof. Note first that if $n \ge 2$, the $J_i$ are uncorrelated and, thus,

    $\sigma^2 = \operatorname{Var}(V)/4 = (n + 1)/16$.

Now,

    $\mathbb{E}^J I[W' - W = 1] = \frac{1}{n(n + 1)} \sum_{k=1}^{n+1} \sum_{l \ne k} (1 - J_k)(1 - J_l)$
        $= \frac{n(n + 1) - (4n + 2) W + 4W^2}{n(n + 1)} =: S(W)$.

Observe that $S$, as a function of $W$, is Lipschitz continuous with
$L_S = \frac{4n + 2}{n(n + 1)}$; thus, applying Lemma 4.1,

    $\operatorname{Var} S(W) \le \frac{(4n + 2)^2 \sigma^2}{n^2 (n + 1)^2}$.

This is enough to prove the $d_{\mathrm{TV}}$-order and, together with Corollary 3.9, the
order $O(n^{-3/4})$ for $d_{\mathrm{loc}}$. Now, noting that Lemma 3.13 yields
$\mathbb{E}|W - \mu|^3 = O(n^{3/2})$, we obtain from Theorem 3.11 the desired order $O(n^{-1})$ for the
$d_{\mathrm{loc}}$-metric. □
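
For small $n$ the coupling (4.3) can be enumerated completely. The following Python sketch, an illustration with the hypothetical choice $n = 4$, builds the joint law of $(V, V')$ exactly and checks its symmetry (exchangeability), the jump condition (2.2) for $W = V/2$, and the mean and variance of $W$.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations, product

n = 4                                                # hypothetical size, n >= 2
joint = defaultdict(Fraction)                        # joint law of (V, V')
pairs = list(permutations(range(n + 1), 2))          # ordered pairs K != L
weight = Fraction(1, 2**n * len(pairs))
for bits in product((0, 1), repeat=n):
    j = list(bits) + [sum(bits) % 2]                 # J_{n+1} makes the total even
    v = sum(j)
    for k, l in pairs:
        v_new = v + 2 - 2 * j[k] - 2 * j[l]          # (4.3): flip J_K and J_L
        joint[(v, v_new)] += weight

exchangeable = all(joint[(a, b)] == joint.get((b, a), 0) for (a, b) in list(joint))
jumps = {(b - a) // 2 for (a, b) in joint}           # possible values of W' - W
marg = defaultdict(Fraction)
for (a, b), pr in joint.items():
    marg[Fraction(a, 2)] += pr                       # law of W = V / 2
mean_W = sum(w * pr for w, pr in marg.items())
var_W = sum((w - mean_W) ** 2 * pr for w, pr in marg.items())
print(exchangeable, sorted(jumps), mean_W, var_W)
```
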

4.3. Anti-voter model on finite graphs. We closely follow the setup of
[14]; see also references therein and [13]. Let $G$ be an $n$-vertex $r$-regular
graph, which is neither bipartite nor an $n$-cycle. At each vertex $i$ we assume
that there is a "voter" attached, having an opinion $J_i^{(t)}$ which can take
the values 0 or 1 in every time point $t \in \mathbb{N}$. Define a Markov chain by the
following transition rule. Choose uniformly a random vertex, say, $i$; then, out
of the neighborhood $\mathcal{N}_i$ of $i$, choose uniformly a random vertex, say, $j$, and let
$J_i^{(t+1)}$ be the opposite of $J_j^{(t)}$ and leave the other voters untouched. Assume
now that the Markov chain is in its equilibrium and put
$W = \sum_{i=1}^n J_i := \sum_{i=1}^n J_i^{(0)}$.

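Theorem 4.5. For the anti-voter model as described above, (3.1) and
(3.2) hold with $R = 0$ and $\lambda = 2/n$ and we have

(4.4)    $\operatorname{Var} S \le \frac{16 r^2 \sigma^2 + \operatorname{Var} Q}{16 r^2 n^2}$,

where

    $Q = \sum_{i=1}^n \sum_{j \in \mathcal{N}_i} (2J_i - 1)(2J_j - 1)$;

hence, as $n \to \infty$,

    $d_{\mathrm{TV}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = O\Bigl( \frac{\sqrt{\operatorname{Var} Q}}{r\sigma^2} + \frac{1}{\sigma} \Bigr)$,

    $d_{\mathrm{loc}}(\mathscr{L}(W), \mathrm{TP}(\mu, \sigma^2)) = O\Bigl( \Bigl( \frac{\sqrt{\operatorname{Var} Q}}{r\sigma^2} + \frac{1}{\sigma} \Bigr)^{3/2} \Bigr)$.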

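Remark 4.6. Note that the bound for $d_{\mathrm{TV}}$ in Theorem 4.5 is very
similar to the bound for the weaker Kolmogorov metric $d_{\mathrm{K}}$ given in Theorem
1.3 of [14]; they obtain

(4.5)    $d_{\mathrm{K}}(\mathscr{L}(W_c), \mathrm{N}(0, 1)) = O\Bigl( \frac{\sqrt{\operatorname{Var} Q}}{r\sigma^2} + \frac{n}{\sigma^3} \Bigr)$,

where $W_c = (W - \mu)/\sigma$.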



Example 4.7. Consider the sequence $K_n$ of complete graphs of size
$n$. Rinott and Rotar [14] show that $\sigma^2 \asymp n$ and $\operatorname{Var} Q = O(n^3)$. Thus, from
Theorem 3.1, the $d_{\mathrm{TV}}$-distance is of the order $O(n^{-1/2})$ and the $d_{\mathrm{loc}}$-distance
of order $O(n^{-3/4})$, which proves the local limit theorem. Now, from (4.8) below,

(4.6)    $S^*(J) = \frac{n^2 - n - (2n - 1) W + W^2}{n(n - 1)} = S(W)$,

and we can thus take $L_S = \frac{2}{n - 1}$. From Lemma 3.13, we obtain
$\mathbb{E}|W - \mu|^3 = O(n^{3/2})$ and, therefore, Theorem 3.11 yields the order $O(n^{-1})$ for
$d_{\mathrm{loc}}$. Note that the estimates on $L_S$ are obtained only because of the explicit
representation (4.6); they are difficult to obtain in general. For further examples of
graphs, see [14].

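Proof of Theorem 4.5. Define $W' := \sum_{i=1}^n J_i^{(1)}$, and note that $(W, W')$
is an exchangeable pair, satisfying (2.1) and (2.2) with the choices $\lambda = 2/n$
and $R = 0$ (for more details, see [14]). Now, let $K$ be the random index
of the vertex that was resampled in the transition from $W$ to $W'$. As
$W' = W - J_K + J_K^{(1)}$,

(4.7)    $S^*(J) = \mathbb{E}^J I[W' - W = 1]$
             $= \frac{1}{n} \sum_{i=1}^n \mathbb{E}^J \{ I[J_i = 0, J_i^{(1)} = 1] \mid K = i \}$
             $= \frac{1}{n} \sum_{i=1}^n (1 - J_i)\, \mathbb{E}^J \{ J_i^{(1)} \mid K = i \}$
             $= \frac{1}{rn} \sum_{i=1}^n \sum_{j \in \mathcal{N}_i} (1 - J_i)(1 - J_j)$.

With $X_i = 2J_i - 1$ and $M = \sum_{i=1}^n X_i$, (4.7) becomes

(4.8)    $S^*(J) = \frac{1}{4rn} \Bigl( rn - \sum_{i=1}^n \sum_{j \in \mathcal{N}_i} (X_i + X_j) + Q \Bigr)$
             $= \frac{rn - 2rM + Q}{4rn}$.

The variance of $S^*$ is thus

(4.9)    $\operatorname{Var} S^*(J) = \frac{4 r^2 \operatorname{Var} M + \operatorname{Var} Q}{16 r^2 n^2} = \frac{16 r^2 \sigma^2 + \operatorname{Var} Q}{16 r^2 n^2}$,

because $\mathbb{E}\{X_i X_j X_k\} = 0$ for any choice of $i$, $j$ and $k$, due to the symmetry
of the anti-voter model, and, hence, $\mathbb{E}\{MQ\} = 0$. □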
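
The algebra leading from (4.7) to (4.8) and the linear drift with $\lambda = 2/n$ hold configuration by configuration, so they can be checked by brute force on a small regular graph. The following Python sketch is an illustration only; it uses a hypothetical example graph, the circulant graph on 7 vertices with offsets 1 and 2, which is 4-regular, contains triangles and is not a cycle.

```python
from fractions import Fraction
from itertools import product

n, offsets = 7, (1, 2)                     # hypothetical circulant graph on Z_7
nbrs = [sorted({(i + d) % n for d in offsets} | {(i - d) % n for d in offsets})
        for i in range(n)]
r = len(nbrs[0])

def s_star_direct(J):
    # (4.7): (1/(rn)) * sum_i sum_{j in N_i} (1 - J_i)(1 - J_j)
    return Fraction(sum((1 - J[i]) * (1 - J[j]) for i in range(n) for j in nbrs[i]), r * n)

def s_star_pm(J):
    # (4.8): (rn - 2 r M + Q) / (4 r n) with X_i = 2 J_i - 1
    X = [2 * x - 1 for x in J]
    M = sum(X)
    Q = sum(X[i] * X[j] for i in range(n) for j in nbrs[i])
    return Fraction(r * n - 2 * r * M + Q, 4 * r * n)

def drift(J):
    # E^J(W' - W): vertex i adopts the opposite of a random neighbour's opinion
    return Fraction(sum((1 - J[j]) - J[i] for i in range(n) for j in nbrs[i]), r * n)

configs = list(product((0, 1), repeat=n))
identity_ok = all(s_star_direct(J) == s_star_pm(J) for J in configs)
drift_ok = all(drift(J) == -Fraction(2, n) * (sum(J) - Fraction(n, 2)) for J in configs)
print(r, identity_ok, drift_ok)
```

The check does not require the equilibrium distribution, which is not explicit in general; only the variance computation (4.9) depends on it.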

Acknowledgments. I thank Andrew Barbour and the referee for many 
helpful suggestions and comments. 